Goto

Collaborating Authors

 feature neuron



cba76ef96c4cd625631ab4d33285b045-Paper-Conference.pdf

Neural Information Processing Systems

Learning disentangled and distributed representation ofgenerativefactors oftheworld isbelieved tobenefit compositional generalization, because those invariant features canbereused assymbols to build exponentially larger amounts of objects with higher complexity [1, 2, 3, 4].



Rectified Lagrangian for Out-of-Distribution Detection in Modern Hopfield Networks

arXiv.org Artificial Intelligence

Modern Hopfield networks (MHNs) have recently gained significant attention in the field of artificial intelligence because they can store and retrieve a large set of patterns with an exponentially large memory capacity. A MHN is generally a dynamical system defined with Lagrangians of memory and feature neurons, where memories associated with in-distribution (ID) samples are represented by attractors in the feature space. One major problem in existing MHNs lies in managing out-of-distribution (OOD) samples because it was originally assumed that all samples are ID samples. To address this, we propose the rectified Lagrangian (RegLag), a new Lagrangian for memory neurons that explicitly incorporates an attractor for OOD samples in the dynamical system of MHNs. RecLag creates a trivial point attractor for any interaction matrix, enabling OOD detection by identifying samples that fall into this attractor as OOD. The interaction matrix is optimized so that the probability densities can be estimated to identify ID/OOD. We demonstrate the effectiveness of RecLag-based MHNs compared to energy-based OOD detection methods, including those using state-of-the-art Hop-field energies, across nine image datasets.


Decoding specialised feature neurons in LLMs with the final projection layer

arXiv.org Artificial Intelligence

Large Language Models (LLMs) typically have billions of parameters and are thus often difficult to interpret in their operation. Such black-box models can pose a significant risk to safety when trusted to make important decisions. The lack of interpretability of LLMs is more related to their sheer size, rather than the complexity of their individual components. The TARS method for knowledge removal (Davies et al 2024) provides strong evidence for the hypothesis that that linear layer weights which act directly on the residual stream may have high correlation with different concepts encoded in the residual stream. Building upon this, we attempt to decode neuron weights directly into token probabilities through the final projection layer of the model (the LM-head). Firstly, we show that with Llama 3.1 8B we can utilise the LM-head to decode specialised feature neurons that respond strongly to certain concepts, with examples such as "dog" and "California". This is then confirmed by demonstrating that these neurons can be clamped to affect the probability of the concept in the output. This extends to the fine-tuned assistant Llama 3.1 8B instruct model, where we find that over 75% of neurons in the up-projection layers have the same top associated token compared to the pretrained model. Finally, we demonstrate that clamping the "dog" neuron leads the instruct model to always discuss dogs when asked about its favourite animal. Through our method, it is possible to map the entirety of Llama 3.1 8B's up-projection neurons in less than 15 minutes with no parallelization.


A Novel Feature Learning-based Bio-inspired Neural Network for Real-time Collision-free Rescue of Multi-Robot Systems

arXiv.org Artificial Intelligence

Natural disasters and urban accidents drive the demand for rescue robots to provide safer, faster, and more efficient rescue trajectories. In this paper, a feature learning-based bio-inspired neural network (FLBBINN) is proposed to quickly generate a heuristic rescue path in complex and dynamic environments, as traditional approaches usually cannot provide a satisfactory solution to real-time responses to sudden environmental changes. The neurodynamic model is incorporated into the feature learning method that can use environmental information to improve path planning strategies. Task assignment and collision-free rescue trajectory are generated through robot poses and the dynamic landscape of neural activity. A dual-channel scale filter, a neural activity channel, and a secondary distance fusion are employed to extract and filter feature neurons. After completion of the feature learning process, a neurodynamics-based feature matrix is established to quickly generate the new heuristic rescue paths with parameter-driven topological adaptability. The proposed FLBBINN aims to reduce the computational complexity of the neural network-based approach and enable the feature learning method to achieve real-time responses to environmental changes. Several simulations and experiments have been conducted to evaluate the performance of the proposed FLBBINN. The results show that the proposed FLBBINN would significantly improve the speed, efficiency, and optimality for rescue operations.


Making a Spiking Net Work: Robust brain-like unsupervised machine learning

arXiv.org Artificial Intelligence

The surge in interest in Artificial Intelligence (AI) over the past decade has been driven almost exclusively by advances in Artificial Neural Networks (ANNs). While ANNs set state-of-the-art performance for many previously intractable problems, the use of global gradient descent necessitates large datasets and computational resources for training, potentially limiting their scalability for real-world domains. Spiking Neural Networks (SNNs) are an alternative to ANNs that use more brain-like artificial neurons and can use local unsupervised learning to rapidly discover sparse recognizable features in the input data. SNNs, however, struggle with dynamical stability and have failed to match the accuracy of ANNs. Here we show how an SNN can overcome many of the shortcomings that have been identified in the literature, including offering a principled solution to the dynamical "vanishing spike problem", to outperform all existing shallow SNNs and equal the performance of an ANN. It accomplishes this while using unsupervised learning with unlabeled data and only 1/50th of the training epochs (labeled data is used only for a simple linear readout layer). This result makes SNNs a viable new method for fast, accurate, efficient, explainable, and re-deployable machine learning with unlabeled data.


Large Associative Memory Problem in Neurobiology and Machine Learning

arXiv.org Machine Learning

Dense Associative Memories or modern Hopfield networks permit storage and reliable retrieval of an exponentially large (in the dimension of feature space) number of memories. At the same time, their naive implementation is non-biological, since it seemingly requires the existence of many-body synaptic junctions between the neurons. We show that these models are effective descriptions of a more microscopic (written in terms of biological degrees of freedom) theory that has additional (hidden) neurons and only requires two-body interactions between them. For this reason our proposed microscopic theory is a valid model of large associative memory with a degree of biological plausibility. The dynamics of our network and its reduced dimensional equivalent both minimize energy (Lyapunov) functions. When certain dynamical variables (hidden neurons) are integrated out from our microscopic theory, one can recover many of the models that were previously discussed in the literature, e.g. the model presented in ''Hopfield Networks is All You Need'' paper. We also provide an alternative derivation of the energy function and the update rule proposed in the aforementioned paper and clarify the relationships between various models of this class.


Perception Coordination Network: A Framework for Online Multi-Modal Concept Acquisition and Binding

AAAI Conferences

A biologically plausible neural network model named Perception Coordination Network (PCN) is proposed for online multi-modal concept acquisition and binding. It is a hierarchical structure inspired by the structure of the brain, and functionally divided into the primary sensory area (PSA), the primary sensory association area (SAA), and the higher order association area (HAA). The PSA processes many elementary features, e.g., colors, shapes, syllables, and basic flavors, etc. The SAA combines these elementary features to represent the unimodal concept of an object, e.g., the image, name and taste of an apple, etc. The HAA connects several primary sensory association areas like a function of synaesthesia, which means associating the image, name and taste of an object. PCN is able to continuously acquire and bind multi-modal concepts in an online way. Experimental results suggest that PCN can handle the multi-modal concept acquisition and binding problem effectively.